METHODS OF SEARCH FOR SOLVING POLYNOMIAL EQUATIONS
By  Peter  Henrici* 

Eidgenössische Technische Hochschule
Zurich,  Switzerland 

Dedicated  to  D.  H.  Lehmer  on  his  65th  birthday 


Abstract 

The  problem  of  determining  a  zero  of  a  given  polynomial  with  guaranteed 
error  bounds,  using  an  amount  of  work  that  can  be  estimated  a  priori,  is 
attacked  here  by  means  of  a  class  of  algorithms  based  on  the  idea  of  systematic 
search. Lehmer's "machine method" for solving polynomial equations is a
special case. The use of the Schur-Cohn algorithm in Lehmer's method is
replaced by a more general proximity test which reacts positively if applied
at a point close to a zero of a polynomial. Various such tests are described,
and the work involved in their use is estimated. The optimality and
non-optimality of certain methods, both on a deterministic and on a
probabilistic basis, are established.

1. Introduction


In 1961 D. H. Lehmer [6] proposed a "machine method" for solving polynomial
equations. His algorithm was guaranteed to approximate a zero of any
given  complex  polynomial  with  an  arbitrarily  small  error.  The  amount  of 
work  necessary  to  compute  a  zero  to  a  given  precision  could  be  estimated  a 
priori. 

In the present paper we shall describe a class of algorithms for polynomial
zerofinding which contains Lehmer's method as a special case. Our
algorithms borrow from Lehmer's method the basic idea of enclosing zeros
in disks of decreasing radius, and of covering disks containing a zero by
smaller disks. However, instead of using a special procedure to determine
whether or not a given disk contains a zero of a polynomial, the algorithms
discussed here merely require a "proximity test" (§2) which reacts positively
if applied at a point close to a zero of the given polynomial. Very simple
proximity tests of this kind exist, and as a consequence some of our
algorithms are arithmetically simpler than Lehmer's method (§3).

The  convergence  of  the  general  search  algorithm  is  established  (§4), 
and  the  maximum  amount  of  work  necessary  to  determine  a  zero  to  a  preassigned 
accuracy  is  estimated  (§5). 

Among the class of all proximity tests, we then identify a subclass for
which the convergence of the resulting algorithms is linear. Among these
tests, the classical Schur-Cohn test (which forms the basis for Lehmer's
method) is shown to enjoy a certain property of optimality (§6). We finally
discuss the best covering strategy if coverings by disks of constant radius
are used. From a deterministic point of view, the best strategy consists
in covering a disk of radius $r$ by eight disks of radius $q_0 r$, where
$q_0 = (1 + 2\cos(2\pi/7))^{-1} = 0.44504$. From a probabilistic point of view,
if coverings by disks of variable radius are permitted, Lehmer's original
covering is slightly better, although not optimal.

Besides  Lehmer’s  paper,  the  present  study  was  inspired  by  the  methods 
of  search  used  in  the  constructive  proofs  of  the  fundamental  theorem  of 
algebra  due  to  Brouwer  [3,  4]  and  Rosenbloom  [10]. 

2.  Proximity  tests 

For positive integers $N$, let $P_N$ denote the class of all monic
polynomials of degree $N$ with complex coefficients,

\[ p(z) = z^N + a_{N-1} z^{N-1} + \cdots + a_0 , \]

whose zeros $\zeta_1, \zeta_2, \ldots, \zeta_N$ satisfy $|\zeta_i| \le 1$,
$i = 1, 2, \ldots, N$. It is our objective to study a class of algorithms for
solving the following problem: Given any $p \in P_N$ and any $\varepsilon > 0$,
to construct a disk $D$ of radius $\varepsilon$ which contains a zero of $p$.
The algorithms to be discussed are uniformly convergent on $P_N$, in the
following sense: The amount of work necessary to construct $D$ is bounded by
a quantity which depends on $\varepsilon$ and $N$, but not on the individual
polynomial $p$.

The basic tool of the algorithms to be described is a proximity test
$T = T(r)$, which can be applied to any polynomial $p \in P_N$ at any point $z$
such that $|z| \le 1$, and which the polynomial either passes or fails. The
test must be such that it is passed at all points $z$ sufficiently close to
a zero, and failed at all points sufficiently far away from every zero.
(There may be an in-between region where the test may be passed or failed.)
The parameter $r$ regulates the difficulty of the test. The smaller $r$ is,
the more difficult it becomes to pass the test.

Speaking formally, a test $T(r)$ is called a proximity test if there
exist two positive functions $\varphi$ and $\psi$, defined on some interval
$0 < r \le r_0$ and having the following properties: If $p$ is any polynomial
in $P_N$, then for all $r \in (0, r_0]$

(i)  $p$ passes $T(r)$ at all points $z$ such that $|z| \le 1$ and
     $|z - \zeta| \le \varphi(r)$ for some zero $\zeta$ of $p$;

(ii) $p$ fails $T(r)$ at all points $z$ such that $|z| \le 1$ and
     $|z - \zeta| > \psi(r)$ for every zero $\zeta$ of $p$.

The above evidently implies that $\varphi(r) \le \psi(r)$; we do not require
that $\varphi = \psi$. We postulate that $T(r)$ becomes arbitrarily difficult
to pass for $r \to 0$, i.e.,

(iii) $\lim_{r \to 0} \psi(r) = 0$.

We furthermore require

(iv) $\psi$ is continuous and strictly monotonically increasing.

The functions $\varphi$ and $\psi$ are called, respectively, the inner and
outer convergence function of the test $T(r)$.

The following test, to be denoted by $T_1$, may serve as a first example
of a proximity test:

"$p$ passes $T_1(r)$ at $z$"  <==>  $|p(z)| \le r$.

To show that this test has the required properties for $0 < r \le 1$, let

\[ p(z) = \prod_{i=1}^{N} (z - \zeta_i) . \]

If $p$ fails the test at $z$, then

\[ |p(z)| = \prod_{i=1}^{N} |z - \zeta_i| > r . \]

Hence for every $i$,

\[ |z - \zeta_i| > r \prod_{j \ne i} |z - \zeta_j|^{-1} . \]

Since $|\zeta_j| \le 1$, $|z| \le 1$, we have $|z - \zeta_j| \le 2$, so every
factor of the product on the right is at least $1/2$, and we find that

\[ |z - \zeta_i| > 2^{-N+1} r , \quad i = 1, \ldots, N . \]

Hence $T_1(r)$ cannot be failed if $|z - \zeta_i| \le 2^{-N+1} r$ for some $i$,
and (i) is true for

\[ \varphi(r) = 2^{-N+1} r . \]

If, on the other hand, $p$ passes $T_1(r)$ at $z$, then

\[ \prod_{i=1}^{N} |z - \zeta_i| \le r , \]

and it follows that

\[ |z - \zeta_i| \le r^{1/N} \]

for at least one index $i$. Thus the test cannot be passed if
$|z - \zeta_i| > r^{1/N}$ for all $i$, and we find that (ii) is true for

\[ \psi(r) = r^{1/N} . \]

(By considering a polynomial with a single zero of multiplicity $N$, we
see that (ii) is not true for any smaller function $\psi$.) It is clear
that $\psi$ has the properties (iii) and (iv).
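
The two bounds just derived can be spot-checked numerically. The following is a minimal Python sketch, not part of the original text: the helper names (`poly_from_zeros`, `passes_t1`, `check_t1`) and the sampling scheme are illustrative assumptions. It samples polynomials with zeros in the unit disk and confirms that no sampled point contradicts (i) with $\varphi(r) = 2^{-N+1} r$ or (ii) with $\psi(r) = r^{1/N}$.

```python
import cmath
import random

def poly_from_zeros(zeros, z):
    """Evaluate the monic polynomial prod (z - zeta_i) at z."""
    value = 1.0 + 0.0j
    for zeta in zeros:
        value *= (z - zeta)
    return value

def passes_t1(zeros, z, r):
    """Test T_1(r): passed at z if and only if |p(z)| <= r."""
    return abs(poly_from_zeros(zeros, z)) <= r

def random_point_in_unit_disk(rng):
    """Sample a point of the closed unit disk (uniform in area)."""
    return cmath.rect(rng.random() ** 0.5, rng.uniform(0.0, 2.0 * cmath.pi))

def check_t1(n_degree=4, r=0.3, trials=2000, seed=0):
    """Return True if no sampled point contradicts properties (i) and (ii)."""
    rng = random.Random(seed)
    phi = 2.0 ** (-n_degree + 1) * r   # inner convergence function of T_1
    psi = r ** (1.0 / n_degree)        # outer convergence function of T_1
    for _ in range(trials):
        zeros = [random_point_in_unit_disk(rng) for _ in range(n_degree)]
        z = random_point_in_unit_disk(rng)
        dist = min(abs(z - zeta) for zeta in zeros)
        passed = passes_t1(zeros, z, r)
        if dist <= phi and not passed:   # would violate (i)
            return False
        if dist > psi and passed:        # would violate (ii)
            return False
    return True
```

A sampling check of this kind can only fail to find counterexamples; it does not replace the proof above.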

Two  tests  are  called  equivalent  if  they  are  defined  on  the  same  domain 
of  r  and  if  they  produce  identical  results  for  all  polynomials  p  at 
all  points  z  and  for  all  values  r  . 

Example: The test $T_1$ is equivalent to a test which is declared
passed if and only if $|p(z)|^2 \le r^2$.

Two proximity tests $T$ and $T^*$ are called similar if there exists
an increasing function $r^*$ mapping $[0, r_0^*]$ onto $[0, r_0]$ such
that the test $T^*(r)$ is equivalent to $T(r^*(r))$. Similar tests
thus differ only in the choice of the parameter. It is clear that the
similarity of tests, too, is an equivalence relation.

Example: The test $T_1$ is similar to the test $T_1^*(r)$ which is passed
if and only if $|p(z)| \le r^N$. Convergence functions for $T_1^*$ are
$\varphi(r) = 2^{-N+1} r^N$ and $\psi(r) = r$.

By (iv), every proximity test is similar to a test with outer convergence
function $\psi(r) = r$.

3. The search algorithm

We require the notion of an $\varepsilon$-covering. If $\varepsilon$ is any
positive number, and if $S$ is any set in the complex plane, an
$\varepsilon$-covering of $S$ is any system of closed disks of radius
$\le \varepsilon$ whose union contains $S$. The covering is said to be
centered in $S$ if the midpoints of the covering disks belong to $S$. The
construction of a minimal $\varepsilon$-covering of a given bounded set
(i.e., a covering containing the least number of disks) can raise intricate
questions of elementary geometry. Of course, one can always use coverings
whose centers form a square or hexagonal grid.

Let $p \in P_N$, let $T$ be a proximity test, and let $\{q_k\}$ be a
monotonic sequence of positive numbers converging to zero such that $q_0 = 1$.
We shall describe an algorithm for constructing a sequence of points $\{z_k\}$
such that each of the disks

\[ D_k = \{ z : |z - z_k| \le q_k \} , \]

$k = 0, 1, 2, \ldots$, contains at least one zero of $p$.

Let $z_0 = 0$. Then $D_0$ certainly contains a zero, for it contains
all zeros. The algorithm now proceeds by induction. Suppose we have
found a point $z_{k-1}$ such that $D_{k-1}$ contains a zero. To construct
$z_k$, we cover the set $D_{k-1} \cap D_0$ with an $\varepsilon_k$-covering
centered in it and apply a test $T(r_k)$ at the center of each covering disk.
The parameters $\varepsilon_k$ and $r_k$ are chosen such that the following
two conditions are met:

(A) The test is passed at the center of each disk of the covering
    which contains a zero.

(B) Any point at which the test is passed is at a distance $\le q_k$
    from a zero.

Condition (A) is satisfied if $\varepsilon_k \le \varphi(r_k)$. Condition (B)
is satisfied if $\psi(r_k) \le q_k$. Thus both conditions are fulfilled if

(1)    \[ r_k = \psi^{-1}(q_k) , \qquad
          \varepsilon_k = \varphi(r_k) = \varphi(\psi^{-1}(q_k)) , \]

where $\psi^{-1}$ denotes the inverse function of $\psi$.

At least one of the covering disks contains a zero, since $D_{k-1}$ contains
one, and since all zeros are contained in $D_0$. Thus by (A), the test
$T(r_k)$ is passed at least once. We let $z_k$ be the first center at which
the test is passed. There is no assurance that the disk of radius
$\varepsilon_k$ surrounding $z_k$ actually contains a zero, but by (B), the
disk $D_k$ does.

The whole algorithm thus may be summarized as follows: Let $z_0 = 0$.
Having constructed $z_{k-1}$, cover the set $D_{k-1} \cap D_0$ by an
$\varepsilon_k$-covering centered in it, and apply $T(r_k)$ at the center of
each covering disk, where $\varepsilon_k$ and $r_k$ are given by (1). Let
$z_k$ be the first center which passes the test.

Provided that identical systems of coverings are used, the above
algorithm remains unchanged if the test $T$ is replaced by a "similar"
test $T^*$.
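
The summary above can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's implementation: the function names, the square-grid covering with a small safety factor, and the demonstration polynomial (degree 2, tested with $T_1$) are all choices made here. Grid centers are moved into $D_{k-1} \cap D_0$ by two radial projections; since $|z_{k-1}| \le 1$, neither projection increases the distance to any point of that set, so the covering stays centered in it and still covers it.

```python
import math

def project_onto_disk(point, center, radius):
    """Nearest point of the closed disk |w - center| <= radius to `point`."""
    d = abs(point - center)
    if d <= radius:
        return point
    return center + (point - center) * (radius / d)

def centered_covering(center, radius, eps):
    """Centers of an eps-covering of S = D(center, radius) ∩ unit disk, all in S.

    Assumes |center| <= 1. A square grid of spacing just under sqrt(2)*eps is
    laid over D(center, radius) and its points are projected into S.
    """
    h = 0.999 * math.sqrt(2.0) * eps
    m = math.ceil(radius / h)
    centers = []
    for i in range(-m, m + 1):
        for j in range(-m, m + 1):
            c = center + complex(i * h, j * h)
            if abs(c - center) <= radius + eps:  # farther grid disks miss D
                c = project_onto_disk(c, center, radius)
                centers.append(project_onto_disk(c, 0j, 1.0))
    return centers

def search(test, phi, psi_inverse, q, steps):
    """Return [z_0, ..., z_steps]; test(z, r) -> bool is the proximity test."""
    z = 0j
    centers = [z]
    for k in range(1, steps + 1):
        r_k = psi_inverse(q(k))      # relation (1)
        eps_k = phi(r_k)
        for c in centered_covering(z, q(k - 1), eps_k):
            if test(c, r_k):
                z = c                # first center passing the test
                break
        else:
            raise RuntimeError("no center passed; check phi and psi")
        centers.append(z)
    return centers

# Hypothetical demonstration: p(z) = (z - 0.3 - 0.4i)(z + 0.5), test T_1.
ZEROS = [0.3 + 0.4j, -0.5 + 0j]
N = len(ZEROS)

def t1(z, r):
    value = 1.0
    for zeta in ZEROS:
        value *= abs(z - zeta)
    return value <= r

centers = search(t1, phi=lambda r: 2.0 ** (-N + 1) * r,
                 psi_inverse=lambda s: s ** N, q=lambda k: 0.5 ** k, steps=4)
```

Each `centers[k]` should lie within $q_k = 2^{-k}$ of a zero. Only four steps are taken because the number of covering disks needed with $T_1$ grows rapidly with $k$.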

4.  Convergence 

By construction, the centers $z_k$ of successive disks $D_k$ satisfy
$|z_{k+1} - z_k| \le q_k$, where $q_k \to 0$. This in itself does not imply the
convergence of the sequence $\{z_k\}$. Nevertheless, there holds

THEOREM 1. The sequence $\{z_k\}$ converges, and its limit is a zero of $p$.



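Proof. Let

\[ \delta = \min_{\zeta_i \ne \zeta_j} |\zeta_i - \zeta_j| \]

be the minimum distance between distinct zeros of $p$. Let $m$ be an integer
such that $3 q_m < \delta$. Let $n \ge m$. The disk $D_n$ contains a zero, say
$\zeta_i$. The disk $D_{n+1}$ likewise contains a zero, say $\zeta_j$. From

\[ |z_n - \zeta_i| \le q_n , \quad |z_{n+1} - z_n| \le q_n , \quad
   |z_{n+1} - \zeta_j| \le q_{n+1} \]

it follows by the triangle inequality and the monotonicity of the sequence
$\{q_n\}$ that

\[ |\zeta_i - \zeta_j| \le 2 q_n + q_{n+1} \le 3 q_m < \delta \]

and hence that $\zeta_i = \zeta_j$. Thus for $n \ge m$, $|z_n - \zeta_i| \le q_n$,
proving

\[ \lim_{n \to \infty} z_n = \zeta_i . \]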
n  *i 


5.  Amount  of  work 

We measure the amount of work required to approximate a zero with an
error $\le \varepsilon$ by estimating the number of applications of the test
$T$ required to construct the first disk $D_k$ whose radius $q_k$ is at most
$\varepsilon$. For reasons of simplicity we assume until further notice that
the centers of the covering disks always form a square grid.

The area of $D_{m-1}$ is $\pi q_{m-1}^2$. In a square $\varepsilon_m$-covering,
the centers of the covering disks must be not more than $\sqrt{2}\,\varepsilon_m$
apart. Neglecting boundary effects, approximately

\[ \frac{\pi}{2} \, \frac{q_{m-1}^2}{\varepsilon_m^2} \]

disks of radius $\varepsilon_m$ are thus required to cover $D_{m-1}$. (Working
with a hexagonal grid, the constant $\pi/2$ could be replaced by
$2\pi/(3\sqrt{3})$.) Within the same degree of approximation, this also is the
maximum number of applications of the test to proceed from $z_{m-1}$ to $z_m$.

For the given sequence $\{q_k\}$ and for $\varepsilon > 0$, let $k(\varepsilon)$
denote the smallest $k$ such that $q_k \le \varepsilon$. By the above, the total
number of applications of the test necessary to approximate a zero with an
error $\le \varepsilon$ does not exceed a quantity of the order of

(2)    \[ w(T, \{q_k\}, \varepsilon) = \frac{\pi}{2}
          \sum_{m=1}^{k(\varepsilon)} \frac{q_{m-1}^2}{\varepsilon_m^2} . \]

We axiomatically define the above function $w$ as the work function of the
search algorithm based on the proximity test $T$ and the sequence $\{q_k\}$.

The work function does not change if the test $T$ is replaced by a similar
test $T^*$.

From the fact that $w$ does not depend on $p$ it already follows that
the search algorithms described earlier are uniformly convergent in the
sense described earlier.

Example. For the test $T_1$, choosing a geometric mode of subdivision
($q_k = q^k$, $0 < q < 1$, $k = 0, 1, 2, \ldots$) we have in view of
$\varphi(r) = 2^{-N+1} r$, $\psi(r) = r^{1/N}$,

\[ \varepsilon_m = 2^{-N+1} q^{mN} , \]

hence

\[ w(T_1, \{q^k\}, \varepsilon) = \frac{\pi}{2} \, 2^{2N-2}
   \sum_{m=1}^{k(\varepsilon)} q^{2m-2-2mN}
   \sim C_N \, q^{-(2N-2) k(\varepsilon)} \]

($\varepsilon \to 0$), where

\[ C_N = \frac{\pi}{2} \, \frac{2^{2N-2}}{q^2 - q^{2N}} . \]

For the determination of a zero of a polynomial of degree 10 with an error
$\le 10^{-6}$, working with $q = 1/2$ (which requires $k = 20$) the function $w$
yields an upper bound of approximately $2^{397} \pi \approx 10^{120}$
applications of the test. Since on the average we can't expect to do much
better than use one half of the maximum number of tests, a search algorithm
based on $T_1$ certainly is not practical.


6.  Proximity  tests  with  linear  convergence  functions 

Suppose the convergence functions of a proximity test $T$ are linear,

(3)    \[ \varphi(r) = a r , \qquad \psi(r) = b r \]

($0 < a \le b$). Then by (1),

\[ \varepsilon_m = \varphi(\psi^{-1}(q_m)) = \frac{a}{b} \, q_m , \]

and the work function (2) becomes

(4)    \[ w(T, \{q_k\}, \varepsilon) = \frac{\pi}{2} \, \frac{b^2}{a^2}
          \sum_{m=1}^{k(\varepsilon)} \frac{q_{m-1}^2}{q_m^2} . \]

In particular, if $q_k = q^k$,

(5)    \[ w(T, \{q^k\}, \varepsilon) = \frac{\pi b^2}{2 a^2 q^2} \, k(\varepsilon) , \]

and the work necessary to compute a zero to a given accuracy is proportional
to the number of decimals required. This convergence behavior is known
as linear convergence.
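
As a hedged illustration, the work function (2) can be evaluated directly and compared with the closed form (5). The function names and the sample values $a = 1$, $b = 3$, $q = 1/2$, $\varepsilon = 10^{-3}$ below are assumptions made for this sketch only.

```python
import math

def k_of_eps(q, eps):
    """Smallest k such that q(k) <= eps, for a sequence q with q(0) = 1."""
    k = 0
    while q(k) > eps:
        k += 1
    return k

def work_function(phi, psi_inverse, q, eps):
    """Formula (2): (pi/2) * sum_{m=1}^{k(eps)} q_{m-1}^2 / eps_m^2,
    with eps_m = phi(psi_inverse(q_m)) as in (1)."""
    total = 0.0
    for m in range(1, k_of_eps(q, eps) + 1):
        eps_m = phi(psi_inverse(q(m)))
        total += q(m - 1) ** 2 / eps_m ** 2
    return 0.5 * math.pi * total

def linear_work(a, b, q_ratio, eps):
    """Closed form (5) for phi(r) = a*r, psi(r) = b*r and q_k = q_ratio**k."""
    k = k_of_eps(lambda j: q_ratio ** j, eps)
    return math.pi * b * b * k / (2.0 * a * a * q_ratio ** 2)

# Sample values (illustrative only).
a, b = 1.0, 3.0
general = work_function(lambda r: a * r, lambda s: s / b,
                        lambda j: 0.5 ** j, 1e-3)
closed = linear_work(a, b, 0.5, 1e-3)
```

With these values $k(\varepsilon) = 10$, and both expressions evaluate to $180\pi$.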

We now shall give some examples of proximity tests with linear
convergence functions. For arbitrary $z$ and $h$, let

\[ p(z + h) = b_0 + b_1 h + b_2 h^2 + \cdots + b_N h^N \]

($b_N = 1$). It will be convenient to suppress the argument $z$ in the Taylor
coefficients $b_i$.
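
The Taylor coefficients $b_0, \ldots, b_N$ at a point $z$ can be obtained by repeated synthetic division (the complete Horner scheme). The sketch below is illustrative; the function name and the convention that `coeffs[k]` holds the coefficient of $x^k$ are assumptions of this example.

```python
def taylor_coefficients(coeffs, z):
    """Return [b_0, ..., b_N] with p(z + h) = sum_k b_k h**k.

    coeffs[k] is the coefficient of x**k in p(x). Each outer pass is one
    synthetic division by (x - z); pass j fixes b_j. The total number of
    multiplications is N(N+1)/2.
    """
    b = list(coeffs)
    n = len(b) - 1
    for j in range(n):
        for i in range(n - 1, j - 1, -1):
            b[i] += z * b[i + 1]
    return b
```

For instance, for $p(x) = x^3 - 2x + 5$ and $z = 2$ one has $p(2 + h) = 9 + 10h + 6h^2 + h^3$.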

6.1. The test $T_2$. Let

\[ B = B(z) = \min_{1 \le k \le N} \left| \frac{b_0}{b_k} \right|^{1/k} . \]

The polynomial $p$ is said to pass the test $T_2(r)$ at $z$ if and only if
$B(z) \le r$. To determine the convergence functions of this test, let

(6)    \[ \rho = \min_{1 \le k \le N} |z - \zeta_k| . \]

The relations of Vieta imply, as is well known,

\[ \rho^k \le \binom{N}{k} \left| \frac{b_0}{b_k} \right| , \quad k = 1, \ldots, N . \]

Since $\binom{N}{k}^{1/k} \le N$, this implies $\rho \le N B(z)$. Hence if
$\rho > N r$, then $B(z) > r$, and $p$ fails $T_2(r)$ at $z$. It follows that

\[ \psi(r) = N r \]

is outer convergence function for $T_2$. On the other hand, let $p$ fail
the test at $z$. Then $B > r$ and hence

\[ \left| \frac{b_k}{b_0} \right| < r^{-k} , \quad k = 1, 2, \ldots, N . \]

If $p(z + h) = 0$ and $|h| = \rho$, the Taylor expansion shows that

\[ \frac{\rho}{r} + \frac{\rho^2}{r^2} + \cdots + \frac{\rho^N}{r^N} > 1 \]

and hence that $\rho > r/2$. It follows that the test cannot be failed if
$\rho \le r/2$, i.e.,

\[ \varphi(r) = \tfrac{1}{2} r \]

is inner convergence function for $T_2$.

Thus $T_2$ has convergence functions of the form (3); we note that
$b/a = 2N$. In the numerical example considered earlier ($N = 10$,
$\varepsilon = 10^{-6}$, $q_k = 2^{-k}$), (4) now furnishes an upper bound of
some 50,000 applications of the test.
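
A small Python sketch of $T_2$ follows; it is an illustration, not the author's code. The helper `shifted_coefficients` (which builds the Taylor coefficients from known zeros purely for checking purposes), the other names, and the random sampling are assumptions of this example. The two convergence functions are equivalent to the two-sided bound $\rho/N \le B(z) \le 2\rho$, which is what the check tests; the last lines evaluate (4) for the numerical example.

```python
import cmath
import math
import random

def shifted_coefficients(zeros, z):
    """Taylor coefficients b_0..b_N of p(z+h) = prod (h - (zeta_i - z))."""
    b = [1.0 + 0j]
    for zeta in zeros:
        root = zeta - z
        nxt = [0j] * (len(b) + 1)
        for k, c in enumerate(b):
            nxt[k + 1] += c          # contribution of h * c h^k
            nxt[k] -= root * c       # contribution of -root * c h^k
        b = nxt
    return b

def t2_measure(b):
    """B = min over k of |b_0 / b_k|^(1/k), skipping vanishing b_k."""
    return min(abs(b[0] / b[k]) ** (1.0 / k)
               for k in range(1, len(b)) if b[k] != 0)

def passes_t2(b, r):
    """Test T_2(r): passed if and only if B <= r."""
    return t2_measure(b) <= r

def check_t2_bounds(n_degree=5, trials=2000, seed=1):
    """True if rho/N <= B <= 2*rho held (up to rounding) in every sample."""
    rng = random.Random(seed)
    def point():
        return cmath.rect(rng.random() ** 0.5, rng.uniform(0.0, 2.0 * math.pi))
    for _ in range(trials):
        zeros = [point() for _ in range(n_degree)]
        z = point()
        rho = min(abs(z - zeta) for zeta in zeros)
        big_b = t2_measure(shifted_coefficients(zeros, z))
        if rho > n_degree * big_b * (1 + 1e-6) or big_b > 2 * rho * (1 + 1e-6):
            return False
    return True

# Bound (4) for N = 10, q_k = 2**-k, k(eps) = 20, with b/a = 2N.
bound = 0.5 * math.pi * (2 * 10) ** 2 * 20 / 0.5 ** 2
```

Here `bound` evaluates to about 50,265, in line with the figure of some 50,000 quoted above.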

6.2. The test $T_3$. The polynomial is said to pass $T_3(r)$ at $z$
if and only if

\[ |b_0| \le |b_1| r + |b_2| r^2 + \cdots + |b_N| r^N . \]

Let $\rho$ be defined by (6). Then for some $h$ such that $|h| = \rho$ we have
$p(z + h) = 0$, hence

\[ |b_0| \le |b_1| \rho + |b_2| \rho^2 + \cdots + |b_N| \rho^N , \]

and $p$ passes $T_3(\rho)$. Thus $\varphi(r) = r$ is inner convergence function
for this test. On the other hand, a theorem of G. D. Birkhoff [2] implies
that the test cannot be passed if $\rho > (2^{1/N} - 1)^{-1} r$. Thus

\[ \psi(r) = \frac{r}{2^{1/N} - 1} \]

is outer convergence function. For this pair of convergence functions,

\[ \frac{b}{a} = \frac{1}{2^{1/N} - 1} \sim \frac{N}{\log 2} \quad (N \to \infty) . \]

For a given sequence $\{q_k\}$, and for linear convergence functions (3),
the value of the work function for a given $\varepsilon$ is proportional to
$b^2/a^2$. For both tests $T_2$ and $T_3$ this ratio is $O(N^2)$ as
$N \to \infty$. This situation is typical for any test that depends only on
the absolute values $|b_k|$, for it is known [9, 1] that the maximum of the
ratio of the largest and smallest absolute value which the smallest zero of a
polynomial of degree $N$ can have if the absolute values of the coefficients
are fixed is precisely $(2^{1/N} - 1)^{-1}$. It follows that smaller values of
$b/a$ can be achieved only with tests that do not merely use the absolute
values of the Taylor coefficients.
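
The test $T_3$ needs only the moduli of the Taylor coefficients. In the sketch below (names and the bisection helper are assumptions of this example), `smallest_passing_radius` locates the smallest $r$ at which $T_3(r)$ is passed; by the convergence functions above it must lie between $(2^{1/N} - 1)\rho$ and $\rho$.

```python
import math

def passes_t3(b, r):
    """Test T_3(r): |b_0| <= |b_1| r + |b_2| r^2 + ... + |b_N| r^N."""
    return abs(b[0]) <= sum(abs(b[k]) * r ** k for k in range(1, len(b)))

def smallest_passing_radius(b, iterations=80):
    """Bisection for the smallest r with passes_t3(b, r); assumes b_0 != 0."""
    lo, hi = 0.0, 1.0
    while not passes_t3(b, hi):      # the right side grows without bound
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if passes_t3(b, mid):
            hi = mid
        else:
            lo = mid
    return hi

# p(z + h) = (h - 1)^2: both zeros at distance rho = 1 from z.
r_double = smallest_passing_radius([1.0, -2.0, 1.0])
# p(z + h) = (h - 0.2)(h + 0.5): rho = 0.2.
r_mixed = smallest_passing_radius([-0.1, 0.3, 1.0])
```

For the double zero the threshold is $\sqrt{2} - 1$, i.e. exactly $(2^{1/N} - 1)\rho$ with $N = 2$, which illustrates that the outer convergence function cannot be improved for this test; for the second polynomial the threshold coincides with $\rho = 0.2$.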

6.3. The test $T_4$. This test makes use of the sums

(7)    \[ s_k = \sum_{i=1}^{N} (\zeta_i - z)^{-k} , \quad k = 1, 2, \ldots . \]

It is easily shown by means of a generating function argument that these
quantities can be computed from the Taylor coefficients at $z$ by means
of the following recurrence relation (with $b_k = 0$ for $k > N$):

\[ s_k = -b_0^{-1} \left( k b_k + s_1 b_{k-1} + s_2 b_{k-2} + \cdots
   + s_{k-1} b_1 \right) , \quad k = 1, 2, \ldots . \]

Let $\rho$ be defined by (6). Then $|s_k| \le N \rho^{-k}$, $k = 1, 2, \ldots$,
and it follows that

(8)    \[ \rho \le \left| \frac{N}{s_k} \right|^{1/k} , \quad k = 1, 2, \ldots . \]

Let

\[ S = \min_{1 \le k \le N} \left| \frac{N}{s_k} \right|^{1/k} . \]

We say that $p$ passes the test $T_4(r)$ at $z$ if and only if $S \le r$. It
follows from (8) that

\[ \psi(r) = r \]

is outer convergence function for this test. Moreover, a rather deep result
of Buckholtz [5] states that $S \le (2 + 2\sqrt{2}) \rho$, where the numerical
constant is best possible. It follows that

\[ \varphi(r) = (2 + 2\sqrt{2})^{-1} r \]

is inner convergence function. For this pair of convergence functions, the
ratio $b/a = 2 + 2\sqrt{2} = 4.8284$ is independent of $N$.
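
The recurrence for the sums $s_k$ can be sketched as follows. The function names are assumptions of this example, and the sketch requires $b_0 \ne 0$ (that is, $z$ is not itself a zero). The sample Taylor coefficients belong to $p(z+h) = (h - 0.2)(h + 0.5)$, for which $s_1 = 3$ and $s_2 = 29$ by direct summation.

```python
import math

def power_sums(b, count):
    """s_1..s_count with s_k = sum_i (zeta_i - z)^(-k), computed from the
    Taylor coefficients b_0..b_N by the recurrence (b_k = 0 for k > N)."""
    n = len(b) - 1
    s = [0j] * (count + 1)           # s[0] is unused
    for k in range(1, count + 1):
        acc = k * (b[k] if k <= n else 0)
        for j in range(1, k):
            if k - j <= n:
                acc += s[j] * b[k - j]
        s[k] = -acc / b[0]
    return s[1:]

def t4_measure(b):
    """S = min over 1 <= k <= N of |N / s_k|^(1/k), skipping vanishing s_k."""
    n = len(b) - 1
    sums = power_sums(b, n)
    return min((abs(n / sk) ** (1.0 / k)
                for k, sk in enumerate(sums, start=1) if sk != 0),
               default=math.inf)

def passes_t4(b, r):
    """Test T_4(r): passed if and only if S <= r."""
    return t4_measure(b) <= r

b_example = [-0.1, 0.3, 1.0]         # zeros of p(z+h) at h = 0.2 and h = -0.5
s_example = power_sums(b_example, 2)
S_example = t4_measure(b_example)
```

In this example $\rho = 0.2$ and $S = (2/29)^{1/2} \approx 0.263$, consistent with $\rho \le S \le (2 + 2\sqrt{2})\rho$.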

6.4. Sharp tests. For a given sequence $\{q_k\}$, and for linear
convergence functions $\varphi$ and $\psi$, the value of the work function (4)
for given $\varepsilon$ is a minimum for a test such that $b = a$. Without loss
of generality it may be assumed that $b = a = 1$. A test with convergence
functions $\varphi(r) = \psi(r) = r$ will be called sharp. A sharp test reacts
positively if and only if the closed disk of radius $r$ about the testing
point $z$ contains a zero. Thus all sharp tests belong to the same class
of equivalent tests.

There exist several realizations of sharp tests. They are based either
on a conformal mapping of the disk onto the left half-plane, followed by the
Routh-Hurwitz algorithm, or (more directly and efficiently) on the
well-known Schur-Cohn algorithm ([8], p. 195) for counting the number of zeros
in a given disk. Lehmer's method [6, 7], the first search algorithm of the
type considered here, was based on the Schur-Cohn algorithm.

In our numerical example ($N = 10$, $q_k = 2^{-k}$, $\varepsilon = 10^{-6}$),
(5) now yields a maximum of a mere 129 tests in an algorithm based on a sharp
test. Due to neglect of boundary effects, the true maximum is somewhat higher;
see below.

The mere fact that the work function is smallest for the Schur-Cohn
test does not in itself imply that this test defines the computationally
most efficient algorithm, since the work function does not take into account
the work required to carry out the test. In the absence of rigorous results
concerning the minimum number of arithmetic operations required to administer
the various tests, precise statements are difficult. Suffice it to say that
all tests described in this section require, among other things, all Taylor
coefficients at $z$. If performed by the Horner algorithm, their computation
requires $\tfrac{1}{2} N^2 + O(N)$ multiplications. The Schur-Cohn algorithm,
if programmed in the superior fashion recommended by Stewart [11], requires
another $\tfrac{1}{2} N^2 + O(N)$ multiplications and divisions, roughly the
same as the computation of the sums $s_k$ required for $T_4$. Thus the
Schur-Cohn test requires only about twice as much work as $T_2$ or $T_3$, and
about the same as $T_4$.

7. Optimum choice of $\{q_k\}$

Suppose the search algorithm is based on a test with linear convergence
functions (3). If $\varepsilon$ is given, for what choice of the sequence
$\{q_k\}$ is the work function $w(T, \{q_k\}, \varepsilon)$ a minimum?

We first answer this question when $k(\varepsilon)$ is prescribed. Let
$\varepsilon > 0$, let $k$ be a given positive integer, and let $\{q_m\}$ be
any decreasing sequence such that $q_0 = 1$, $q_k = \varepsilon$. Then, by the
inequality of the arithmetic and geometric mean,

\[ w(T, \{q_m\}, \varepsilon)
   = C \sum_{m=1}^{k} \frac{q_{m-1}^2}{q_m^2}
   \ge C k \left( \prod_{m=1}^{k} \frac{q_{m-1}^2}{q_m^2} \right)^{1/k}
   = C k \, \varepsilon^{-2/k}
   = w(T, \{\varepsilon^{m/k}\}, \varepsilon) , \]

where $C = \pi b^2 / (2 a^2)$ is the constant factor in (4), and we have proved:

THEOREM 2. Let $\varepsilon > 0$ and let $k > 0$ be an integer. Among all
monotonic sequences $\{q_m\}$ such that $q_0 = 1$ and $q_k = \varepsilon$, the
work function (4) assumes its smallest value for the geometric sequence
$q_m = \varepsilon^{m/k}$, $m = 0, 1, 2, \ldots$.
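
Theorem 2 can be illustrated numerically. The following sketch (function names and sample values are assumptions of this example) compares the sum $\sum q_{m-1}^2 / q_m^2$, which is the work function (4) up to the constant factor $\pi b^2/(2a^2)$, for the geometric sequence and for randomly chosen decreasing sequences with the same endpoints.

```python
import random

def ratio_sum(q):
    """sum_{m=1}^{k} q_{m-1}^2 / q_m^2 for a sequence q = [q_0, ..., q_k]."""
    return sum((q[m - 1] / q[m]) ** 2 for m in range(1, len(q)))

def geometric_sequence(eps, k):
    """q_m = eps**(m/k), m = 0..k, so that q_0 = 1 and q_k = eps."""
    return [eps ** (m / k) for m in range(k + 1)]

def random_decreasing_sequence(eps, k, rng):
    """q_0 = 1 >= q_1 >= ... >= q_k = eps with random interior points."""
    interior = sorted((rng.uniform(eps, 1.0) for _ in range(k - 1)),
                      reverse=True)
    return [1.0] + interior + [eps]

rng = random.Random(0)
eps, k = 1e-3, 6
best = ratio_sum(geometric_sequence(eps, k))      # equals k * eps**(-2/k)
others = [ratio_sum(random_decreasing_sequence(eps, k, rng))
          for _ in range(1000)]
```

With $\varepsilon = 10^{-3}$ and $k = 6$ the geometric sequence gives $k\,\varepsilon^{-2/k} = 60$, and none of the random sequences does better.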

On the basis of this result, we now restrict our attention to geometric
sequences, $q_m = q^m$ ($0 < q < 1$), and ask for the optimal value of $q$ to
achieve a given accuracy $\varepsilon$. As a function of $q$ and $\varepsilon$,
$k(\varepsilon)$ is now the smallest integer $k$ such that $q^k \le \varepsilon$, or

\[ k(\varepsilon) = - \left[ - \frac{\log \varepsilon}{\log q} \right] , \]

where $[x]$ denotes the largest integer $\le x$. Neglecting a fractional part,
we thus have approximately

\[ w(T, \{q^k\}, \varepsilon) \approx C \, \frac{1}{q^2} \,
   \frac{\log \varepsilon}{\log q} \]

(with $C = \pi b^2 / (2 a^2)$ as above). By differentiation we easily find that
the minimum of the above expression is attained for $q = e^{-1/2} = 0.60653$,
and that the value of the minimum is $2 e C \log (1/\varepsilon)$.
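
The location and value of this minimum are easy to confirm numerically. The sketch below (an illustration; the function name and grid resolution are arbitrary choices made here) minimizes the $q$-dependent factor $1/(q^2 \log(1/q))$ over a fine grid.

```python
import math

def relative_work(q):
    """The factor 1 / (q^2 log(1/q)) of C log(1/eps) / (q^2 log(1/q))."""
    return 1.0 / (q * q * math.log(1.0 / q))

grid = [i / 100000 for i in range(1, 100000)]   # q in (0, 1)
q_best = min(grid, key=relative_work)
```

The grid minimizer agrees with $e^{-1/2} = 0.60653$ to grid accuracy, and the minimum value of the factor is $2e$.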

Unfortunately, the above result does not indicate accurately the
maximum number of tests to be applied, because the method of counting the
covering disks underlying (2) becomes increasingly inaccurate (due to the
neglect of boundary effects) if the ratio of the radii of the covering disks
and of the disk to be covered approaches 1. To determine the exact maximum,
let, for $0 < x \le 1$, $f(x)$ denote the minimum number of disks of radius
$x$ that are required to cover the unit disk. The function $f$ is
nonincreasing, piecewise constant, and continuous from the right; no simple
analytical expression for it exists. To proceed from $z_m$ to $z_{m+1}$ in a
search algorithm based on a test with linear convergence functions and on a
geometric sequence $\{q^m\}$ requires covering a disk of radius $q^m$ by disks
of radius $\frac{a}{b} q^{m+1}$. Hence, if an optimal covering is used, at most
$f(\frac{a}{b} q)$ applications of the test are necessary. The actual maximum
number of tests to attain an error $\le \varepsilon$ thus equals

\[ W(a, b, q, \varepsilon) = - f\!\left( \frac{a}{b} q \right)
   \left[ - \frac{\log \varepsilon}{\log q} \right] . \]

We shall determine the minimum of $W$ as a function of $q$ for the
Schur-Cohn test ($a = b = 1$).

THEOREM 3. For sufficiently small fixed values of $\varepsilon$, the function
$F(q, \varepsilon) = W(1, 1, q, \varepsilon)$ assumes its minimum at
$q = q_0 = (1 + 2\cos(2\pi/7))^{-1}$. The value of the minimum is

\[ F(q_0, \varepsilon) = -8 \left[ - \frac{\log \varepsilon}{\log q_0} \right]
   = -8 \left[ - \frac{\log \varepsilon^{-1}}{0.8096} \right] . \]


Proof. We first determine the minimum of the function

\[ G(q) = f(q) \, \frac{\log \varepsilon}{\log q} . \]

Let the points of discontinuity of $f$ be, in decreasing order,
$1 = x_0 > x_1 > x_2 > \cdots$, and let the constant value of $f$ in the
interval $x_m \le x < x_{m-1}$ be denoted by $f_m$ ($m = 1, 2, \ldots$). Then
$G(q)$ is increasing in each of the intervals $x_m \le q < x_{m-1}$, and has a
downward jump at the points $x_m$ ($m = 1, 2, \ldots$). It thus is smallest
where

\[ G(x_m) = f_m \, \frac{\log \varepsilon}{\log x_m} \]

is smallest. It can be shown that


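\[ x_m = \left( 2 \cos \frac{\pi}{m+2} \right)^{-1} , \quad f_m = m + 2
   \quad \text{for } m = 1, 2, 3 ; \]
\[ x_m = \left( 1 + 2 \cos \frac{2\pi}{m+2} \right)^{-1} , \quad f_m = m + 3
   \quad \text{for } m = 4, 5, 6 . \]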
From these values and from the trivial estimate $f(x) \ge x^{-2}$ it follows
by computation that the minimum is assumed only at
$q_0 = x_5 = (1 + 2\cos(2\pi/7))^{-1} = 0.44504$, and that it has the value

\[ G(q_0) = \frac{8}{0.8096} \log \varepsilon^{-1} = 9.882 \log \varepsilon^{-1} . \]

The function $F$ has the form $F(q) = f(q) h(q)$, where

\[ h(q) = - \left[ - \frac{\log \varepsilon}{\log q} \right] . \]

The function $h$ is piecewise constant, nondecreasing, and continuous
from the left. We denote its points of discontinuity by
$0 < h_0 < h_1 < h_2 < \cdots$. Evidently, $F(q) \ge G(q)$, with equality
holding if and only if $q = h_n$ for some $n$. Let $n^*$ be the smallest index
$n$ such that $h_n \ge q_0$. For sufficiently small values of $\varepsilon$,
the points $h_n$ are arbitrarily dense, hence $h_{n^*} < x_4$, and furthermore

\[ F(h_{n^*}) < G(x_m) , \quad m \ne 5 . \]

It follows that $F(h_{n^*})$ is the smallest value of $F$. If $h_{n^*} = q_0$,
the Theorem is established. If $h_{n^*} > q_0$, the Theorem follows from the
fact that $F(q)$ is constant for $q_0 \le q \le h_{n^*}$.
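
The comparison "by computation" can be partly reproduced. The sketch below is hedged: it assumes that the second family of discontinuity points reads $x_m = (1 + 2\cos(2\pi/(m+2)))^{-1}$ with $f_m = m + 3$ for $m = 4, 5, 6$ (one central disk surrounded by $m + 2$ disks), which is the reading consistent with $q_0 = x_5$ in Theorem 3. It evaluates the factor $f_m / \log(1/x_m)$ of $G(x_m)$ for these three values only; the function name is an assumption of this example.

```python
import math

def x_center_ring(m):
    """Assumed discontinuity point x_m = (1 + 2 cos(2 pi / (m + 2)))^(-1)."""
    return 1.0 / (1.0 + 2.0 * math.cos(2.0 * math.pi / (m + 2)))

# G(x_m) / log(1/eps) = f_m / log(1/x_m), with f_m = m + 3 for m = 4, 5, 6.
candidates = {m: (m + 3) / math.log(1.0 / x_center_ring(m)) for m in (4, 5, 6)}
m_best = min(candidates, key=candidates.get)
```

Under this assumption the factor is about 10.10 for seven disks, 9.88 for eight disks, and 10.21 for nine disks, so the eight-disk covering is the best of the three.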

The optimal covering of the unit disk by 8 disks of radius $q_0$ consists
of a disk centered at the origin, surrounded by 7 disks centered at the
points

\[ R \, e^{2\pi i k / 7} , \quad k = 0, 1, \ldots, 6 , \]

where

\[ R = \frac{2 \cos \frac{\pi}{7}}{1 + 2 \cos \frac{2\pi}{7}} = 0.80194 . \]
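
That these eight disks cover the unit disk can be checked by sampling. The following Python sketch is illustrative only; the names, the polar sampling grid, and the tolerance are assumptions of this example. The grid is chosen so that it contains the points $e^{i\pi/7}$ and its rotations, where neighbouring outer disks just meet on the unit circle.

```python
import cmath
import math

Q0 = 1.0 / (1.0 + 2.0 * math.cos(2.0 * math.pi / 7.0))   # covering radius
R = 2.0 * math.cos(math.pi / 7.0) * Q0                   # ring of centers
CENTERS = [0j] + [cmath.rect(R, 2.0 * math.pi * k / 7.0) for k in range(7)]

def is_covered(point, radius, tol=1e-9):
    """True if `point` lies in one of the eight closed disks (up to tol)."""
    return any(abs(point - c) <= radius + tol for c in CENTERS)

def covers_unit_disk(radius, n_r=200, n_theta=140):
    """Check coverage on a polar grid of the closed unit disk."""
    for i in range(n_r + 1):
        for j in range(n_theta):
            p = cmath.rect(i / n_r, 2.0 * math.pi * j / n_theta)
            if not is_covered(p, radius):
                return False
    return True
```

On this grid the radius $q_0$ passes the check, while a radius one percent smaller leaves the point $e^{i\pi/7}$ uncovered.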


8. Non-uniform coverings

So far in this study, it was assumed that the covering of each disk
$D_k$ consists of disks of constant radius. It is a trivial matter to
modify the definition of the basic search algorithm to permit coverings
of variable radius and to extend the convergence theorem to this case.

Also  the  upper  bounds  for  the  amount  of  work  are  easily  adapted  to  extend 
to  such  non-uniform  coverings. 

However,  the  optimality  considerations  of  section  7  strongly  depend 
on  the  constancy  of  the  radii  of  the  covering  disks,  and  it  is  far  from 
obvious how they should be modified for non-uniform coverings. It appears
certain,  however,  that  the  methods  using  uniform  coverings  are  not  optimal 
in  the  class  of  methods  using  arbitrary  coverings. 

The  efficiency  of  an  algorithm  can  also  be  judged  from  a  probabilistic 
point  of  view,  for  instance  by  computing  the  average  number  Z  of  appli¬ 
cations  of  the  test  required  to  improve  the  accuracy  of  a  zero  by  one 
decimal  digit.  Here  again  the  methods  using  uniform  coverings  are  not 
optimal.  For  the  optimal  method  using  uniform  coverings  determined  in 
Theorem  3,  it  can  be  shown  that 

Z = 11.168 .

Lehmer's method covers the unit disk by one disk centered at 0 and by
8 disks centered on a circle about 0; the radii are those specified in [6].
For this covering, if the sequence of surrounding disks is chosen optimally
as suggested in [6],

Z = 11.143 .

It can be shown that Lehmer's covering is again not optimal, if only for
the trivial reason that it has some built-in slack to counteract rounding.
The detailed investigation of optimal non-uniform coverings must, however,
wait for another paper.